Understanding Convolution: From a Simple Mathematical Operation to Computer Vision

How Does Convolution Work?

Suppose we have two vectors:

$$X = [1, 2, 3, 4, 5]$$
$$Y = [6, 7, 8, 9, 10]$$

From them we can build a matrix $a$ of dimensions $|X| \times |Y|$ (here $5 \times 5$), the outer product of the two vectors, in which:

$$a_{ij} = X_i \cdot Y_j$$
In [2]:
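import numpy as np
x, y = np.array([1, 2, 3, 4, 5]), np.array([6, 7, 8, 9, 10])
z = np.outer(x, y)  # z[i, j] = x[i] * y[j]
print(z)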
[[ 6  7  8  9 10]
 [12 14 16 18 20]
 [18 21 24 27 30]
 [24 28 32 36 40]
 [30 35 40 45 50]]

The convolution of vectors X and Y is a vector Z of length $|X| + |Y| - 1$ (here 9). Its $k$-th element is the sum of the entries on the $k$-th anti-diagonal of this matrix, that is, of all $a_{ij}$ with $i + j = k + 1$:

$$Z_1 = 6$$
$$Z_2 = 12+7 = 19$$
$$Z_3 = 18+14+8 = 40$$
$$Z_4 = 24+21+16+9 = 70$$
$$Z_5 = 30+28+24+18+10 = 110$$
$$Z_6 = 35+32+27+20 = 114$$
$$Z_7 = 40+36+30 = 106$$
$$Z_8 = 45+40 = 85$$
$$Z_9 = 50$$
In [3]:
print(np.convolve(x,y))
[  6  19  40  70 110 114 106  85  50]
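
The diagonal rule can also be checked directly. The following is a minimal sketch, not part of the original notebook: the helper name `convolve_via_outer` is made up for illustration, and `x` and `y` are assumed to hold the values of $X$ and $Y$ given earlier. It adds every entry `a[i, j]` of the outer-product matrix to output position `i + j` (0-based), which amounts to summing along one anti-diagonal at a time.

```python
import numpy as np

def convolve_via_outer(x, y):
    """Full convolution as sums over the anti-diagonals of the outer product."""
    a = np.outer(x, y)                        # a[i, j] = x[i] * y[j]
    n, m = a.shape
    z = np.zeros(n + m - 1, dtype=a.dtype)    # one output element per anti-diagonal
    for i in range(n):
        for j in range(m):
            z[i + j] += a[i, j]               # entries with equal i + j share a diagonal
    return z

x = np.array([1, 2, 3, 4, 5])
y = np.array([6, 7, 8, 9, 10])
z = convolve_via_outer(x, y)
print(np.array_equal(z, np.convolve(x, y)))   # True
```

Because the output has `len(x) + len(y) - 1` elements, nothing in this construction requires the two vectors to have the same length.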

2D Analogue


Fig.1. Image consisting of randomly generated pixels

In [7]:
from scipy.signal import convolve2d

f = np.full((3, 3), 0.11)  # 3x3 box-blur kernel; 0.11 approximates 1/9

result = convolve2d(random_matrix, f, mode='valid')


Fig. 2. The image from Fig. 1 (right) and the result of convolving it with the 3×3 averaging (blur) filter (left)
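
What `mode='valid'` computes can be sketched with plain NumPy loops. This is an illustrative sketch under stated assumptions, not the notebook's own code: the helper `convolve2d_valid` is hypothetical, the definition of `random_matrix` is not shown in the notebook, so an 8×8 array of random values with an arbitrary seed stands in for it, and the kernel uses exact weights of 1/9 so that each output pixel is the mean of a 3×3 neighbourhood.

```python
import numpy as np

def convolve2d_valid(image, kernel):
    """Naive 2D convolution, keeping only positions where the kernel fits entirely."""
    kh, kw = kernel.shape
    flipped = kernel[::-1, ::-1]              # convolution flips the kernel in both axes
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * flipped)
    return out

rng = np.random.default_rng(0)                # arbitrary seed, for repeatability only
random_matrix = rng.random((8, 8))            # assumed stand-in for the notebook's image
f = np.full((3, 3), 1 / 9)                    # exact averaging kernel
result = convolve2d_valid(random_matrix, f)
print(random_matrix.shape, result.shape)      # (8, 8) (6, 6)
```

With a 3×3 kernel the output loses one pixel on every side, which is why an 8×8 input yields a 6×6 result.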

In [8]:
from scipy.signal import convolve2d

f = np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]])  # 3x3 Laplacian kernel

result = convolve2d(random_matrix, f, mode='valid')


Fig. 3. The image from Fig. 1 after convolution with the blur filter (left) and with the Laplacian edge filter (right)
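
To see why the second filter responds to edges, it helps to apply it to a tiny image whose values are known in advance. The sketch below is an illustration, not the notebook's code: it assumes a hypothetical 5×6 test image with a single vertical step edge instead of the random image of Fig. 1, and it uses NumPy's `sliding_window_view` in place of `convolve2d`. The two agree here because this kernel is unchanged by a 180° flip.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

f = np.array([[0, 1, 0],
              [1, -4, 1],
              [0, 1, 0]])                     # same kernel as in the cell above

image = np.zeros((5, 6), dtype=int)           # hypothetical test image
image[:, 3:] = 1                              # left half 0, right half 1: one vertical edge

windows = sliding_window_view(image, f.shape) # shape (3, 4, 3, 3): every 3x3 patch
response = (windows * f).sum(axis=(-2, -1))   # 'valid' filtering with a symmetric kernel
print(response)                               # every row is 0, 1, -1, 0
```

The response is zero wherever the patch is flat and non-zero only next to the step, so the filter keeps the edge and discards uniform regions.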

In neural networks, we use multiple convolutional layers. Why?

In [9]:
from scipy.signal import convolve2d

f = np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]])  # 3x3 Laplacian kernel

result = convolve2d(random_matrix, f, mode='valid')

# Apply the same filter five more times (six passes in total)
for _ in range(5):
    result = convolve2d(result, f, mode='valid')


Fig. 4. The image from Fig. 1 after a single convolution (right) and after six successive convolutions with the same filter (left)
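
One effect of repeating a convolution can be sketched in one dimension. The example below is an assumption-laden illustration, not the notebook's code: the 3-tap kernel `[1, -2, 1]` serves as a 1D analogue of the filter above, and the signal is arbitrary random integers. It shows that after six passes each output value depends on 13 input samples, and that six purely linear passes are equivalent to a single pass with one 13-tap kernel.

```python
import numpy as np

k = np.array([1, -2, 1])                      # assumed 1D analogue of the 3x3 kernel
rng = np.random.default_rng(0)                # arbitrary seed
signal = rng.integers(0, 10, size=20)         # hypothetical test signal

stacked = signal
effective = np.array([1])                     # identity kernel to start from
for _ in range(6):
    stacked = np.convolve(stacked, k, mode='valid')   # one more "layer"
    effective = np.convolve(effective, k)             # kernel grows by 2 taps per pass

single = np.convolve(signal, effective, mode='valid')
print(len(k), len(effective))                 # 3 13
print(len(signal), len(stacked))              # 20 8
print(np.array_equal(stacked, single))        # True
```

In a neural network a nonlinear activation usually sits between convolutional layers, so the stack does not collapse into one large kernel this way; what carries over is the growing region of the input that each output value can see.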


Applications of Convolution

Smoothing charts

Polynomial multiplication

Probability theory

Photo editing

Neural network architectures
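
Three of the applications listed above fit in a few lines of `np.convolve`. The numbers below are made-up illustrations, not data from the notebook: two small polynomials, two fair six-sided dice, and a short series smoothed with a 3-point moving average.

```python
import numpy as np

# Polynomial multiplication: coefficients listed in ascending order of power.
# (1 + 2x + 3x^2) * (4 + 5x) = 4 + 13x + 22x^2 + 15x^3
p = np.array([1, 2, 3])
q = np.array([4, 5])
product = np.convolve(p, q)
print(product.tolist())                       # [4, 13, 22, 15]

# Probability: distribution of the sum of two independent fair dice.
die = np.full(6, 1 / 6)                       # P(face) for faces 1..6
two_dice = np.convolve(die, die)              # index 0 corresponds to a sum of 2
p_seven = two_dice[7 - 2]                     # probability that the sum is 7 (6/36)

# Smoothing: 3-point moving average of a noisy series.
series = np.array([1.0, 5.0, 3.0, 7.0, 5.0, 9.0])
window = np.ones(3) / 3
smoothed = np.convolve(series, window, mode='valid')   # close to [3, 5, 5, 7]
```

In each case the kernel carries the meaning: polynomial coefficients, a probability distribution, or averaging weights.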

Questions

  1. What is convolution?
  2. Do the vectors we convolve have to be the same length?
  3. Where is convolution used?
  4. Can convolution be performed only on one-dimensional objects?
  5. What element has the greatest impact on how the image will change after convolution?
  6. Why do we use convolutional layers in neural networks?
  7. Why do we use multiple convolutional layers in a single neural network architecture?
  8. Give an example of a neural network architecture that includes convolutional layers.
  9. Is convolution a time-consuming operation?
  10. How many layers can convolutional neural networks currently have?

Bibliography

  1. https://medium.com/latinxinai/convolutional-neural-network-from-scratch-6b1c856e1c07
  2. https://www.youtube.com/watch?v=KuXjwB4LzSA
  3. https://doi.org/10.48550/arXiv.1409.1556
  4. https://proceedings.neurips.cc/paper_files/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf
  5. https://intellipaat.com/community/11105/why-are-inputs-for-convolutional-neural-networks-always-squared-images
  6. https://en.wikipedia.org/wiki/AlexNet
  7. https://github.com/fastai/fastbook/blob/master/01_intro.ipynb